Concept Recognition and the TREC Genomics Tasks
Abstract
We applied concept recognition techniques to the primary and secondary tasks of the Genomics track. For the primary task, we developed a foundational information retrieval system that incorporated Entrez Gene entries and UMLS concepts for query expansion, representing synonyms as boosted phrases and boosted terms. For the secondary task, we evaluated three conceptual features (mouse strain names, indexed MeSH terms, and normalized citations) in addition to two surface linguistic features (bag-of-words (BOW) and bigrams). Our final feature set yielded consistently high F-measures.
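As a rough illustration of query expansion with phrasal and term boosting, the sketch below builds a boosted query string from a synonym table. It is not the authors' system: the function name `expand_query`, the tiny `SYNONYMS` table (standing in for Entrez Gene and UMLS lookups), the boost weights, and the Lucene-style `^` boost syntax are all assumptions made for the example.

```python
# Hypothetical synonym table; a real system would draw these from
# Entrez Gene entries and UMLS concepts.
SYNONYMS = {"TP53": ["tumor protein p53", "p53"]}


def expand_query(query_terms, synonyms, phrase_boost=2.0, term_boost=0.5):
    """Return a Lucene-style query string with boosted synonym clauses.

    The boost values are illustrative defaults, not taken from the paper.
    """
    clauses = []
    for term in query_terms:
        # Keep the original query term as an unboosted exact match.
        clauses.append(f'"{term}"')
        for synonym in synonyms.get(term, []):
            # Phrasal representation: the whole synonym as a boosted phrase.
            clauses.append(f'"{synonym}"^{phrase_boost}')
            # Term representation: each word of a multi-word synonym,
            # added as a separately boosted single term.
            words = synonym.split()
            if len(words) > 1:
                for word in words:
                    clauses.append(f"{word}^{term_boost}")
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(expand_query(["TP53"], SYNONYMS))
```

In this sketch the phrase form is weighted above the individual words, so that a document matching the full synonym outranks one that merely shares a word with it; whether the original system weighted them this way is not stated in the abstract.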
Similar resources
Concept Based Document Retrieval for Genomics Literature
The 2006 TREC Genomics evaluation focuses on document, passage and aspect retrieval in the genomics domain. The Erasmus Medical Center, TNO and University of Twente collaborated on an approach combining concept tagging (named entity recognition) and information retrieval based on statistical language models. Experiments on the 2004 collection show that document retrieval based on concepts could...
BioText Team Report for the TREC 2003 Genomics Track
The BioText project team participated in both tasks of the TREC 2003 genomics track. Key to our approach in the primary task was the use of an organism-name recognition module, a module for recognizing gene name variants, and MeSH descriptors. Text classification improved the results slightly. In the secondary task, the key insight was casting it as a classification problem of choosing between ...
TREC 2005 Genomics Track Experiments at DUTAI
This paper describes the techniques we applied to the two tasks of the TREC Genomics track, i.e., the ad hoc retrieval and categorization tasks. For the ad hoc retrieval task, we used query expansion, different scoring strategies for different parts of the Medline record (Title, Abstract, RN, MH, etc.), and pseudo-relevance feedback. Our submitted run DUTAdHoc2 obtained a MAP of 0.2349. For the categoriza...
MeSH Based Feedback, Concept Recognition and Stacked Classification for Curation Tasks
This paper reports on experiments carried out in the context of the Genomics track at TREC 2004. The experiments concentrated on two subtasks: the ad hoc retrieval task and the triage task. Experiments for the ad hoc task aimed at improving a standard full-text ad hoc run (using a language modeling approach) by exploiting the manual classification of MEDLINE abstracts (the MeSH terms) for r...
TREC GENOMICS Track Overview
The first year of the TREC Genomics Track featured two tasks: ad hoc retrieval and information extraction. Both tasks centered on the Gene Reference into Function (GeneRIF) resource of the National Library of Medicine, which was used both as pseudo-relevance judgments for ad hoc document retrieval and as target text for information extraction. The track attracted 29 groups who participated i...